June 2004 Proceedings of the 2004 ACM SIGMOD international conference on
Management of data
Full text available: pdf(355.41 KB) Additional Information: full citation, abstract, references

Cardinality estimation during query optimization relies on simplifying assumptions that 
usually do not hold in practice. To diminish the impact of inaccurate estimates during 
optimization, statistics on query expressions (SITs) have been previously proposed. These 
statistics help directly model the distribution of tuples on query sub-plans. Past work in 
statistics on query expressions has exploited view matching technology to harness their 
benefits. In this paper we argue against such an approac ... 

2 Query execution and optimization: Weighted hypertree decompositions and optimal 
query plans 

Francesco Scarcello, Gianluigi Greco, Nicola Leone

Hypertree width [22, 25] is a measure of the degree of cyclicity of hypergraphs. A
number of relevant problems from different areas, e.g., the evaluation of conjunctive 
queries in database theory or the constraint satisfaction in AI, are tractable when their 
underlying hypergraphs have bounded hypertree width. However, in practical contexts 
like the evaluation of database queries, we have more information besides the structure 
of queries. For instance, we know the number of tuples in relations, ...

This paper reports our experiences building the query optimizer for TI's Open OODB
system. To the best of our knowledge, it is the first working object query optimizer to be 
based on a complete extensible optimization framework including logical algebra, 
execution algorithms, property enforcers, logical transformation rules, implementation 
rules, and selectivity and cost estimation. Our algebra incorporates a new materialize 
operator with its corresponding logical transform ...

In a large modern enterprise, information is almost inevitably distributed among several
database management systems. Despite considerable attention from the research 
community, relatively few commercial systems have attempted to address this issue. This 
paper describes new technology that enables clients of IBM's DB2 Universal Database to 
access the data and specialized computational capabilities of a wide range of non- 
relational data sources. This technology, based on the Garlic prototype deve ... 

5 Optimizing multiple dimensional queries simultaneously in multidimensional 
databases 

Weifa Liang, Maria E. Orlowska, Jeffrey X. Yu 

February 2000 The VLDB Journal - The International Journal on Very Large Data
Bases, Volume 8 Issue 3-4
Publisher: Springer-Verlag New York, Inc. 

Full text available: pdf(269.57 KB) Additional Information: full citation, abstract, citings, index terms

Some significant progress related to multidimensional data analysis has been achieved in 
the past few years, including the design of fast algorithms for computing datacubes, 
selecting some precomputed group-bys to materialize, and designing efficient storage 
structures for multidimensional data. However, little work has been carried out on 
multidimensional query optimization issues. Particularly the response time (or evaluation 
cost) for answering several related dimensional queries simultaneous ... 

Keywords: Data warehousing, MDDBs, Multiple dimensional query optimization, OLAP, 
Query modeling

Relational query optimizers have traditionally relied upon table cardinalities when
estimating the cost of the query plans they consider. While this approach has been and 
continues to be successful, the advent of the Internet and the need to execute queries 
over streaming sources requires a different approach, since for streaming inputs the 
cardinality may not be known or may not even be knowable (as is the case for an 
unbounded stream). In view of this, we propose shifting from a cardinality-ba ...

Multiway spatial joins

Nikos Mamoulis, Dimitris Papadias 

December 2001 ACM Transactions on Database Systems (TODS), Volume 26 issue 4 
Publisher: ACM Press 

Full text available: pdf(2.04 MB) Additional Information: full citation, abstract, references, citings, index terms, review

Due to the evolution of Geographical Information Systems, large collections of spatial 
data having various thematic contents are currently available. As a result, the interest of 
users is not limited to simple spatial selections and joins, but complex query types that 
implicate numerous spatial inputs become more common. Although several algorithms 
have been proposed for computing the result of pairwise spatial joins, limited work exists 
on processing and optimization of multiway spatial join ... 

Keywords: Multiway joins, query processing, spatial joins

Management of data

Publisher: ACM Press 

Full text available: pdf(1.33 MB) Additional Information: full citation, abstract, references, citings, index terms

Statistics play an important role in influencing the plans produced by a query optimizer. 
Traditionally, optimizers use statistics built over base tables and assume independence 
between attributes while propagating statistical information through the query plan. This 
approach can introduce large estimation errors, which may result in the optimizer 
choosing inefficient execution plans. In this paper, we show how to extend a generic 
optimizer so that it also exploits statistics built on expression ... 

9 Research sessions: XML I: StatiX: making XML count
Juliana Freire, Jayant R. Haritsa, Maya Ramanath, Prasan Roy, Jerome Simeon
June 2002 Proceedings of the 2002 ACM SIGMOD international conference on
Management of data
Publisher: ACM Press 

Full text available: pdf(1.13 MB) Additional Information: full citation, abstract, references, citings, index terms

The availability of summary data for XML documents has many applications, from 
providing users with quick feedback about their queries, to cost-based storage design and 
query optimization. StatiX is a novel XML Schema-aware statistics framework that exploits 
the structure derived by regular expressions (which define elements in an XML Schema) 
to pinpoint places in the schema that are likely sources of structural skew. As we discuss 
below, this information can be used to build conci ... 

10 Multiple-granularity interleaving for piggyback query processing
Brian Dunkel, Qiang Zhu, Wing Lau, Suyun Chen 

November 1999 Proceedings of the 1999 conference of the Centre for Advanced
Studies on Collaborative research

Publisher: IBM Press 

Full text available: pdf(353.91 KB) Additional Information: full citation, abstract, references, index terms

Piggyback query processing is a new technique, described in [24], intended to perform
additional useful computation (e.g., database statistics collection) during normal query
processing, taking full advantage of data resident in main memory. Different types of
beneficial piggybacking have been identified and studied, but how to efficiently integrate
piggyback operations with a given user query is still an open issue. In this paper, we 
propose a technique of multiple-granularity interleaving to effi ... 

Keywords: database statistics, multiple-granularity interleaving, piggybacking, query 
optimization, query processing

Management of data
Publisher: ACM Press 

Full text available: pdf(209.06 KB) Additional Information: full citation, abstract, references, citings, index terms

The advent of XML as a universal exchange format, and of Web services as a basis for 
distributed computing, has fostered the apparition of a new class of documents: dynamic 
XML documents. These are XML documents where some data is given explicitly while 
other parts are given only intensionally by means of embedded calls to web services that 
can be called to generate the required information. By the sole presence of Web services, 
dynamic documents already include inherently some form of di ... 

12 Building knowledge base management systems 

John Mylopoulos, Vinay Chaudhri, Dimitris Plexousakis, Adel Shrufi, Thodoros Topaloglou
December 1996 The VLDB Journal - The International Journal on Very Large Data
Bases, Volume 5 Issue 4
Publisher: Springer-Verlag New York, Inc. 

Full text available: pdf(403.22 KB) Additional Information: full citation, abstract, citings, index terms

Advanced applications in fields such as CAD, software engineering, real-time process 
control, corporate repositories and digital libraries require the construction, efficient 
access and management of large, shared knowledge bases. Such knowledge bases cannot 
be built using existing tools such as expert system shells, because these do not scale up, 
nor can they be built in terms of existing database technology, because such technology 
does not support the rich representational structure and infer ... 

Keywords: Concurrency control, Constraint enforcement, Knowledge base management 
systems, Rule management, Storage management 



13 MIL primitives for querying a fragmented world
Peter A. Boncz, Martin L. Kersten 

October 1999 The VLDB Journal - The International Journal on Very Large Data
Bases, Volume 8 Issue 2
Publisher: Springer-Verlag New York, Inc. 

Full text available: pdf(261.36 KB) Additional Information: full citation, abstract, citings, index terms

In query-intensive database application areas, like decision support and data mining, 
systems that use vertical fragmentation have a significant performance advantage. In 
order to support relational or object oriented applications on top of such a fragmented 
data model, a flexible yet powerful intermediate language is needed. This problem has 
been successfully tackled in Monet, a modern extensible database kernel developed by 
our group. We focus on the design choices made in the Monet interprete ...

15 Data transformation and duplicate detection: Execution of data mappers
Paulo Carreira, Helena Galhardas 

June 2004 Proceedings of the 2004 international workshop on Information quality in
information systems
Publisher: ACM Press 

Full text available: pdf(158.21 KB) Additional Information: full citation, abstract, references

Data mappers are essential operators for implementing data transformations supporting 
schema mapping and integration scenarios such as legacy data migration, ETL processes 
for data warehousing, data cleaning activities, and business integration initiatives. Despite 
their widespread use, no formalization of this important operation has been proposed so 
far. In this paper we propose the data mapper operator as an extension to the relational 
algebra. We supply a set of algebrai ... 

16 Data integration and sharing I: Capturing both types and constraints in data
integration

Michael Benedikt, Chee-Yong Chan, Wenfei Fan, Juliana Freire, Rajeev Rastogi

June 2003 Proceedings of the 2003 ACM SIGMOD international conference on
Management of data
Publisher: ACM Press 

Full text available: pdf(690.62 KB) Additional Information: full citation, abstract, references, citings, index terms

We propose a framework for integrating data from multiple relational sources into an XML 
document that both conforms to a given DTD and satisfies predefined XML constraints. 
The framework is based on a specification language, AIG, that extends a DTD by (1) 
associating element types with semantic attributes (inherited and synthesized, inspired by 
the corresponding notions from Attribute Grammars), (2) computing these attributes via 
parameterized SQL queries over multiple data sources, and (3) inc ... 

17 Optimization of dynamic query evaluation plans
Richard L. Cole, Goetz Graefe 

May 1994 ACM SIGMOD Record, Proceedings of the 1994 ACM SIGMOD international
conference on Management of data SIGMOD '94, Volume 23 issue 2
Publisher: ACM Press 

Full text available: pdf(1.45 MB) Additional Information: full citation, abstract, references, citings, index terms

Traditional query optimizers assume accurate knowledge of run-time parameters such as 
selectivities and resource availability during plan optimization, i.e., at compile time. In 
reality, however, this assumption is often not justified. Therefore, the "static" plans 
produced by traditional optimizers may not be optimal for many of their actual run-time 
invocations. Instead, we propose a novel optimization model that assigns the bulk of the 
optimization effort to compile-time and ...

June 2004 Proceedings of the 2004 ACM SIGMOD international conference on
Management of data
Publisher: ACM Press 

Full text available: pdf(197.27 KB) Additional Information: full citation, abstract, references

An effective query optimizer finds a query plan that exploits the characteristics of the 
source data. In data integration, little is known in advance about sources' properties, 
which necessitates the use of adaptive query processing techniques to adjust query 
processing on-the-fly. Prior work in adaptive query processing has focused on 
compensating for delays and adjusting for mis-estimated cardinality or selectivity values. 
In this paper, we present a generalized architecture for adaptiv ... 

19 GPGPU: general purpose computation on graphics hardware 
David Luebke, Mark Harris, Jens Krüger, Tim Purcell, Naga Govindaraju, Ian Buck, Cliff
Woolley, Aaron Lefohn 

August 2004 Proceedings of the conference on SIGGRAPH 2004 course notes GRAPH '04

Publisher: ACM Press 

Full text available: pdf(63.03 MB) Additional Information: full citation, abstract

The graphics processor (GPU) on today's commodity video cards has evolved into an 
extremely powerful and flexible processor. The latest graphics architectures provide 
tremendous memory bandwidth and computational horsepower, with fully programmable 
vertex and pixel processing units that support vector operations up to full IEEE floating 
point precision. High level languages have emerged for graphics hardware, making this 
computational power accessible. Architecturally, GPUs are highly parallel s ... 

20 Research papers: optimization: RankSQL: query algebra and optimization for
relational top-k queries 

Chengkai Li, Kevin Chen-Chuan Chang, Ihab F. Ilyas, Sumin Song 
June 2005 Proceedings of the 2005 ACM SIGMOD international conference on
Management of data
Publisher: ACM Press 

Full text available: pdf(741.54 KB) Additional Information: full citation, abstract, references

This paper introduces RankSQL, a system that provides a systematic and principled 
framework to support efficient evaluations of ranking (top-k) queries in relational 
database systems (RDBMS), by extending relational algebra and query optimization. 
Previously, top-k query processing is studied in the middleware scenario or in RDBMS in a
"piecemeal" fashion, i.e., focusing on specific operator or sitting outside the core of query
engines. In contrast, we aim to support ranking ... 